Detecting DNA Modifications from SMRT Sequencing Data by Modeling Sequence Context Dependence of Polymerase Kinetic
Abstract
DNA modifications such as methylation and DNA damage can play critical regulatory roles in biological systems. Single-molecule, real-time (SMRT) sequencing technology generates DNA sequences as well as DNA polymerase kinetic information that can be used for the direct detection of DNA modifications. We demonstrate that local sequence context has a strong impact on DNA polymerase kinetics in the neighborhood of the incorporation site during the DNA synthesis reaction. This makes it possible to estimate the expected kinetic rate of the enzyme at the incorporation site from kinetic rate information collected from existing SMRT sequencing data (historical data) covering the same local sequence contexts of interest. We develop an empirical Bayesian hierarchical model for incorporating historical data. Our results show that the model can greatly increase DNA modification detection accuracy and reduce the control data coverage required. For some DNA modifications with a strong signal, a control sample is not needed at all, because historical data can serve as an alternative to the control. Thus, sequencing costs can be greatly reduced by using the model. We implemented the model in an R package named seqPatch, which is available at https://github.com/zhixingfeng/seqPatch.
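The idea of borrowing strength from historical data can be illustrated with a toy normal-normal shrinkage estimate. This is only a sketch under simplifying assumptions and is not the model implemented in seqPatch: it assumes a single kinetic summary per sequence context (for example, a mean log inter-pulse duration), a known within-sample noise variance, and a prior estimated from the historical values by moments. The function names `expected_kinetics` and `modification_z_score` and all parameters are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def expected_kinetics(historical_means, control_values, noise_var):
    """Estimate the expected kinetic value for one sequence context.

    historical_means: per-dataset means for this context (historical data).
    control_values:   observations from the control sample (may be empty).
    noise_var:        assumed known variance of a single observation.
    Returns (posterior_mean, posterior_variance) under a normal-normal model.
    """
    if len(historical_means) < 2:
        raise ValueError("need at least two historical values to fit the prior")
    # Empirical Bayes step: fit the prior from the historical data by moments.
    prior_mean = mean(historical_means)
    prior_var = variance(historical_means)
    if prior_var <= 0:
        raise ValueError("historical values must not all be identical")

    n = len(control_values)
    if n == 0:
        # No control coverage: fall back on the historical prior alone.
        return prior_mean, prior_var

    # Precision-weighted combination of the prior and the control mean.
    prior_prec = 1.0 / prior_var
    data_prec = n / noise_var
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * mean(control_values))
    return post_mean, post_var


def modification_z_score(native_values, expected_mean, expected_var, noise_var):
    """Z-score of the native-sample mean against the expected kinetic value."""
    n = len(native_values)
    return (mean(native_values) - expected_mean) / sqrt(expected_var + noise_var / n)
```

With low control coverage the estimate stays close to the historical prior, and with no control data it reduces to the prior itself, which mirrors the abstract's point that a control sample may be unnecessary when the modification signal is strong.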
Related resources
White Paper - Detecting DNA Base Modifications Using SMRT Sequencing
Traditionally, it has been a challenge to study the wide variety of base modifications that are seen in nature. Most high-throughput techniques focus only on cytosine methylation and involve both bisulfite sequencing to convert unmethylated cytosine nucleotides to uracil nucleotides, and comparison of sequence reads from bisulfite-treated and untreated samples. SMRT sequencing, in contrast, doe...
qDNAmod: a statistical model-based tool to reveal intercellular heterogeneity of DNA modification from SMRT sequencing data
In an isogenic cell population, phenotypic heterogeneity among individual cells is common and critical for survival of the population under different environmental conditions. DNA modification is an important epigenetic factor that can regulate phenotypic heterogeneity. The single molecule real-time (SMRT) sequencing technology provides a unique platform for detecting a wide range of DNA modifica...
Analysis of RNA base modification and structural rearrangement by single-molecule real-time detection of reverse transcription
BACKGROUND Zero-mode waveguides (ZMWs) are photonic nanostructures that create highly confined optical observation volumes, thereby allowing single-molecule-resolved biophysical studies at relatively high concentrations of fluorescent molecules. This principle has been successfully applied in single-molecule, real-time (SMRT®) DNA sequencing for the detection of DNA sequences and DNA base modif...
Understanding Accuracy in SMRT Sequencing
Introduction Single Molecule, Real-Time (SMRT) DNA sequencing achieves highly accurate sequencing results, exceeding 99.999% (Q50) accuracy, regardless of the DNA's sequence context or GC content. This is possible because SMRT Sequencing excels in all three categories that are relevant when considering accuracy in DNA sequencing: 1. Consensus accuracy 2. Sequence context bias 3. Mappability of ...
Real-time DNA sequencing from single polymerase molecules.
Pacific Biosciences has developed a method for real-time sequencing of single DNA molecules (Eid et al., 2009), with intrinsic sequencing rates of several bases per second and read lengths into the kilobase range. Conceptually, this sequencing approach is based on eavesdropping on the activity of DNA polymerase carrying out template-directed DNA polymerization. Performed in a highly parallel op...